Comparative study of the transcriptional regulatory networks of E. coli and yeast: 
Structural characteristics leading to marginal dynamic stability 
Dynamical properties of the transcriptional regulatory networks of Escherichia
coli and Saccharomyces cerevisiae are studied within the framework of random
Boolean functions. The dynamical response of these networks to a single point
mutation is characterized by the number of mutated elements as a function of
time and the distribution of the relaxation time to a new stationary state,
which turn out to differ between the two networks. Comparison with the behavior
of randomized networks reveals relevant structural characteristics other than
the mean connectivity, namely the organization of circuits and the functional
form of the in-degree distribution. The abundance of single-element circuits in
E. coli and the power-law in-degree distribution of S. cerevisiae shift their
dynamics towards marginal stability, overcoming the restrictions imposed by
their mean connectivities, which is argued to be related to the simultaneous
presence of robustness and adaptivity in living organisms.



INTRODUCTION 

Living organisms depend simultaneously on a stable internal environment and a
capability to adapt to a fluctuating external environment [1]. Since the
biological characteristics of an organism are determined by the interplay
between its gene repertoire and the regulatory apparatus, robustness and
adaptiveness should be generic features of the molecular interactions composing
the gene regulation machinery. The organization of the gene transcriptional
regulatory network has been analyzed for numerous organisms, in particular for
the prokaryote Escherichia coli (E. coli) and the eukaryote Saccharomyces
cerevisiae (S. cerevisiae).

Adaptivity of an organism implies the production of different cell types with
different functions from the same genome. This begins with transcription
regulated by certain proteins, the transcription factors (TFs). The
identification of the target genes for each TF allows the construction of a
gene transcriptional regulatory network, where the nodes are the genes or
operons that produce TFs or are regulated by TFs, and the directed edges
indicate a regulatory dependence: a directed edge from node A to node B implies
that a TF encoded by gene A is involved in the regulation of the expression of
gene B. The expression level of each gene defines the dynamical state of the
network. To achieve robustness and adaptiveness at the same time, one expects
the regulatory network dynamics to be neither chaotic nor fully insensitive to
perturbations, but marginally stable. Structural characteristics of the network
must support these dynamical features.

Our study reveals specific topological features in the transcriptional
regulatory network architecture of E. coli and S. cerevisiae that shift the
dynamics towards marginal stability. E. coli's network has a very low mean
connectivity (the number of edges per node), which in random networks would
lead to high stability and thus impair adaptiveness. We find, however, that
single-element circuits, which are anomalously abundant in E. coli's network,
help mutations triggered by random perturbations to persist, favoring an
unstable dynamical behavior. S. cerevisiae, on the other hand, has a
sufficiently high mean connectivity, which in random networks favors chaotic
dynamics and thus impairs stability. Here we find that S. cerevisiae's network
has a broad (algebraic) node degree distribution, and we demonstrate the
stabilizing effect of this feature upon the dynamics.

Practically, the information about the transcriptional regulatory network
structure (which TF binds to which gene) is available via chromatin
immunoprecipitation microarray experiments. The question of whether a specific
TF enhances or inhibits the expression of a specific target gene has to be
studied separately. However, those individual interactions do not necessarily
occur independently: regulatory interactions are often combinatorial [10] and
time-, cell cycle-, or environment-dependent, which limits the available
information on the complete regulation profile. Generic dynamical features then
have to be extracted using model interactions as suggested by Kauffman. One
digitizes the continuous expression level to a Boolean variable, 0 (inactive)
and 1 (active), and assigns to each gene a random, static regulation rule in
the form of a random Boolean function that determines the gene's state at the
next time step from the current states of its regulators. Here random means
that the output value of these Boolean functions is 0 or 1 with equal
probabilities.
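
The random Boolean rule described above can be illustrated with a minimal
Python sketch. It is not the authors' code: the four-node wiring, the function
names `random_rule` and `step`, and the random seed are all hypothetical
choices made for illustration. Each gene gets a truth table with one random
output per input configuration, and all genes are updated synchronously.

```python
import random

def random_rule(k, rng):
    """Truth table with 2**k entries; each output is 0 or 1 with equal probability."""
    return [rng.randint(0, 1) for _ in range(2 ** k)]

def step(state, regulators, rules):
    """Synchronous update: every node reads the current states of its regulators."""
    new_state = []
    for node, regs in enumerate(regulators):
        index = 0
        for r in regs:                       # pack regulator states into a table index
            index = (index << 1) | state[r]
        new_state.append(rules[node][index])
    return new_state

rng = random.Random(0)
# Hypothetical wiring: nodes 0 and 1 have no regulator, node 2 is regulated
# by nodes 0 and 1, and node 3 is regulated by node 2.
regulators = [[], [], [0, 1], [2]]
rules = [random_rule(len(regs), rng) for regs in regulators]

trajectory = [[rng.randint(0, 1) for _ in regulators]]
for _ in range(5):
    trajectory.append(step(trajectory[-1], regulators, rules))
```

A node without regulators has a truth table with a single entry, so it takes a
constant value from the first update onwards.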

Based on considerations of random Boolean networks with a fixed number of
regulators $k$ for every element, Kauffman hypothesized that distinct
stationary states (limit cycles) correspond to different types of cells. This
idea got some support from the agreement of the scaling behavior of the number
of limit cycles for $k = 2$ random Boolean networks and the number of cell
types with respect to the genome size, but was also debated. Among networks
with fixed in-degree, $k = 2$ is a critical point distinguishing two different
dynamical phases, stable and unstable against perturbations, suggesting that
the regulatory network dynamics of living organisms is "on the edge" between
order and chaos [11].

FIG. 1: An example of Boolean dynamics. (a) A Boolean network of four nodes
and three directed edges. Each node has a Boolean variable $\sigma_i$
($i$ = A, B, C, D). (b) Regulating rules $f_i$ determining the state of node
$i$ at time $t+1$ with its regulators' states at time $t$ as input. The nodes
A and B have no regulator and their Boolean variables take constant values at
time $t+1$ regardless of their values at time $t$. (c) An example of the time
evolution of those Boolean variables under the regulating rules in (b).

However, real regulatory networks do not have a fixed in-degree but a
heterogeneous connectivity, and even their average in-degree
$\langle k \rangle$ is usually different from 2. Nevertheless the Boolean model
itself is useful, and recently the effects of the nature of the regulating
rules on the dynamical stability were studied within its framework [15]. We
propose that the network structure itself is also relevant for the
stability/instability aspect mentioned before. Therefore we construct a network
from the data for the transcriptional regulatory interactions for E. coli and
S. cerevisiae, and study how a point mutation, i.e., an altered dynamical state
of a single element, spreads over the whole network by inducing further
mutations through regulatory interactions.



METHOD

Datasets. For the transcriptional regulatory network in E. coli, we used the
data of Ref. [5], which are based on an existing database, RegulonDB, and
enhanced by literature search. The resultant network consists of 418 operons
and 519 interactions with 111 nodes having at least one outward edge. The data
for S. cerevisiae are taken from the literature and were obtained from the
combination of chromatin immunoprecipitation and DNA microarray analysis. We
chose the P value threshold 0.01, yielding a network of 4555 nodes and 12455
directed edges with 112 nodes having at least one outward edge. Isolated nodes
and those possessing only self-regulation have been excluded in both networks
since they have no interaction with other elements.

Random Boolean functions. These experimental data establish a directed network
$G$ of $N$ nodes, and we assign a dynamic Boolean variable $\sigma_i$ (that can
take on the values 0 or 1 only, corresponding to an inactive or active state,
respectively) to each node $i$. These dynamical variables evolve synchronously
via

$$\sigma_i(t+1) = f_i\left(\sigma_{i_1}(t), \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t)\right),$$

where $i_1, i_2, \ldots, i_{k_i}$ are the $k_i$ regulators of node $i$, i.e.,
the nodes whose outward edges are incident on node $i$. The output value of
$f_i$ for each input configuration
$\{\sigma_{i_1}(t), \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t)\}$ is 0 with
probability $p$ or 1 with probability $1 - p$; it is determined at the
beginning and does not change with time. If $k_i = 0$, $\sigma_i$ is fixed at
$f_i$: $\sigma_i(t+1) = f_i$ regardless of the value of $\sigma_i(t)$. The
parameter $p$ characterizes the randomness of the regulating rules: if $p = 0$
or 1, the dynamics is frozen, while the system tends to be disordered with
$p = 1/2$. An example network with this Boolean dynamics is given in Fig. 1.

Stability measure. The stability of a time trajectory $\Sigma(t)$ is assessed
by the effects of a point mutation $\sigma_j \to 1 - \sigma_j$ on the dynamical
evolution of the subsequent states. For this, we choose a configuration
$\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_N\}$ and prepare its mutant,
$\hat{\Sigma} = \{\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_N\}$,
where $\hat{\sigma}_i = \sigma_i$ for all $i$ except $j$, with $j$ chosen
arbitrarily and $\hat{\sigma}_j = 1 - \sigma_j$. Evolving $\Sigma$ and
$\hat{\Sigma}$ on the same network with the same regulating rules, we count
$n_m(t)$, the number of elements $i$ with
$\hat{\sigma}_i(t) \neq \sigma_i(t)$, at each time step $t$. A node with
$\Delta\sigma_i(t) = |\hat{\sigma}_i(t) - \sigma_i(t)| > 0$ is considered as
mutated. We average $n_m(t)$ over different realizations of the regulating
rules and different initial pairs of configurations to get the average,
$N_m(t) = \langle n_m(t) \rangle$, which converges to its stationary value
$N_m$. For each individual normal-mutant pair $(\Sigma, \hat{\Sigma})$, one can
measure the relaxation time $t_r$ after which $n_m(t)$ reaches its stationary
value. Its distribution $P(t_r)$ is investigated as well.



RESULTS



Time evolution of the number of mutated elements



Figure 2(a) and (b) present the results for the number of mutated elements
$N_m(t)$ and $N_m$. In E. coli, $N_m(t)$ decreases very rapidly from
$N_m(0) = 1$ to a much smaller value for all $p$. For S. cerevisiae, on the
other hand, $N_m(t)$ increases with time up to a value larger than 1 for
$\lambda > 0.42$ ($0.3 < p < 0.7$), indicating the occurrence of a mutation
cascade. Here $\lambda = 2p(1-p)$ is the probability that a regulating rule
yields different output values from different input configurations. In both
E. coli and S. cerevisiae, $N_m$ increases with increasing $p$ from 0 to 1/2
(or decreasing $p$ from 1 to 1/2), since $\lambda$ has its maximum at
$p = 1/2$. In E. coli, $N_m$ stays smaller than 0.3, indicating that
system-wide mutations are suppressed. Figure 2 also shows that in S. cerevisiae
$N_m$ is smaller than in E. coli for $\lambda < 0.2$ but increases with
$\lambda$ more rapidly and is larger for $\lambda > 0.2$.

FIG. 2: Number of mutated elements $N_m(t)$ and
$N_m = \lim_{t\to\infty} N_m(t)$ and distribution of the relaxation time
$P(t_r)$. (a) Plot of the stationary value $N_m$ versus $\lambda = 2p(1-p)$ in
the original network and two types of randomized graphs (see the text for the
definition) for E. coli. The data are averages over 10^ initial pairs of
configurations for each of more than 10^ realizations of regulating rules. The
approximation given in Eq. (2) is drawn together. The inset shows the time
development of $N_m(t)$ for selected values of $\lambda$ in the original
E. coli network. (b) The same data as (a) for S. cerevisiae. (c) Plots of
$P(t_r)$ with $p = 1/2$ ($\lambda = 1/2$) on the original networks and the
randomized graphs for E. coli and S. cerevisiae.

The functional form of $P(t_r)$ for $p = 1/2$ in Fig. 2(c) is strikingly
different between E. coli and S. cerevisiae: it is exponential for E. coli and
a power law in $t_r$ for S. cerevisiae. This long tail of $P(t_r)$ implies that
in the case of S. cerevisiae an element can be mutated and recover even at very
late times in the dynamics.
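
The measurement behind $N_m(t)$ can be sketched as follows. This is a minimal
illustration under stated assumptions, not the authors' implementation: the
function names, the four-node chain, the sample count, and the seed are
hypothetical. A configuration and its single-site mutant are evolved with the
same rules, the number of differing nodes $n_m(t)$ is recorded, and the result
is averaged over rule realizations and initial pairs.

```python
import random

def random_rule(k, p, rng):
    """Truth table with 2**k entries; each output is 0 with probability p, else 1."""
    return [0 if rng.random() < p else 1 for _ in range(2 ** k)]

def step(state, regulators, rules):
    """Synchronous update of all nodes from the states of their regulators."""
    out = []
    for node, regs in enumerate(regulators):
        index = 0
        for r in regs:
            index = (index << 1) | state[r]
        out.append(rules[node][index])
    return out

def mutation_trace(regulators, p, t_max, rng):
    """n_m(t) for t = 0..t_max, for one rule realization and one normal-mutant pair."""
    n = len(regulators)
    rules = [random_rule(len(regs), p, rng) for regs in regulators]
    normal = [rng.randint(0, 1) for _ in range(n)]
    mutant = list(normal)
    j = rng.randrange(n)
    mutant[j] = 1 - mutant[j]                # point mutation at node j
    trace = [1]
    for _ in range(t_max):
        normal = step(normal, regulators, rules)
        mutant = step(mutant, regulators, rules)
        trace.append(sum(a != b for a, b in zip(normal, mutant)))
    return trace

def average_trace(regulators, p, t_max, samples, seed=0):
    """N_m(t): average of n_m(t) over rule realizations and initial pairs."""
    rng = random.Random(seed)
    totals = [0] * (t_max + 1)
    for _ in range(samples):
        for t, value in enumerate(mutation_trace(regulators, p, t_max, rng)):
            totals[t] += value
    return [total / samples for total in totals]

# Hypothetical chain 0 -> 1 -> 2 -> 3 (a tree, no circuits).
chain = [[], [0], [1], [2]]
n_m = average_trace(chain, p=0.5, t_max=6, samples=200)
```

On this circuit-free chain a mutation can only travel downstream, so the
averaged trace must vanish once it has passed the last node.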



Mean connectivity 

These differences in the mutation spread dynamics may be primarily attributed
to a difference in the mean connectivity and can be understood by a mean-field
approach [13]. The probability $H(t) = \lim_{N\to\infty} N_m(t)/N$ that a
randomly chosen node $i$ is mutated at time $t$, also called the Hamming
distance, is given in terms of the probability that a regulator of the node $i$
is mutated, which we denote by $\bar{H}(t)$, and the probability $\lambda$ that
the regulating rule $f_i$ yields different output values from different input
configurations, as

$$H(t+1) = \sum_{k} \lambda \left[1 - \left(1 - \bar{H}(t)\right)^{k}\right] P_d(k),$$

$$\bar{H}(t+1) = \sum_{k,q} \lambda \left[1 - \left(1 - \bar{H}(t)\right)^{k}\right] \frac{q\, P_d(k,q)}{\langle q \rangle}. \qquad (1)$$

Here $P_d(k,q)$ is the joint probability that a node has in-degree $k$ and
out-degree $q$ and is related to the in-degree distribution by
$P_d(k) = \sum_q P_d(k,q)$. $H(t)$ and $\bar{H}(t)$ evolve towards their
stationary values $H$ and $\bar{H}$. Setting
$\bar{H}(t+1) = \bar{H}(t) = \bar{H}$ and expanding the second line of Eq. (1)
for small $\bar{H}$, one finds

$$\bar{H} \simeq \bar{H}\,\lambda \frac{\langle kq \rangle}{\langle q \rangle} - \bar{H}^{2}\,\lambda \frac{\langle k^{2}q \rangle}{2\langle q \rangle} + O(\bar{H}^{3}),$$

provided $\langle q \rangle$, $\langle kq \rangle$, and
$\langle k^{2}q \rangle$ are all finite. Therefore $H$ and $\bar{H}$ are zero
for $\lambda$ smaller than a critical value $\lambda_c$, with
$\lambda_c = 1/K$ and $K = \langle kq \rangle / \langle q \rangle$, and
non-zero otherwise. The expression $\lambda_c = K^{-1}$ for the critical point
holds true as long as $K$ is finite. Since $\lambda = 2p(1-p) \le 1/2$, the
Hamming distance $H$ can be positive only if $K > 2$. Hence
$N_m \simeq HN$ for finite $N$ should be small in E. coli, which has
$K \simeq 1.08$, and can be large, of order $N$, for $\lambda > 0.42$ in
S. cerevisiae, which has $K \simeq 2.35$. Although the Hamming distance is not
necessarily of order $N^{-1}$ at $\lambda_c$, one finds the value of $\lambda$
for which $N_m = 1$ very close to the value $\lambda_c \simeq 0.42$ in the
latter. The in-degree $k$ and the out-degree $q$ show no significant
correlation in the two networks according to our analysis not presented here,
that is, $P_d(k,q) \simeq P_d(k)P_d(q)$, which yields
$\langle kq \rangle \simeq \langle k \rangle \langle q \rangle$ and
$K \simeq \langle k \rangle$.
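
As a worked illustration of the critical point $\lambda_c = 1/K$ with
$K = \langle kq \rangle / \langle q \rangle$, the following Python sketch
computes $K$ from a directed edge list and converts $\lambda_c$ into the window
of $p$ in which $\lambda = 2p(1-p)$ exceeds it. The function names and the toy
edge list are hypothetical; only the formulas are taken from the text.

```python
from collections import Counter

def branching_parameter(edges):
    """K = <kq>/<q>, with k the in-degree and q the out-degree of each node."""
    k = Counter(target for _, target in edges)
    q = Counter(source for source, _ in edges)
    nodes = set(k) | set(q)
    kq_sum = sum(k[v] * q[v] for v in nodes)   # Counter returns 0 for missing nodes
    q_sum = sum(q[v] for v in nodes)
    return kq_sum / q_sum

def critical_lambda(K):
    """Critical value of lambda = 2p(1-p) from the mean-field expansion."""
    return 1.0 / K

def unstable_p_window(lam_c):
    """Interval of p with 2p(1-p) > lam_c, or None when lam_c >= 1/2."""
    if lam_c >= 0.5:
        return None
    half_width = (0.25 - lam_c / 2.0) ** 0.5
    return 0.5 - half_width, 0.5 + half_width

toy_edges = [(0, 1), (1, 2), (2, 1)]           # hypothetical example graph
toy_K = branching_parameter(toy_edges)

yeast_window = unstable_p_window(0.42)         # lambda_c quoted for S. cerevisiae
ecoli_window = unstable_p_window(critical_lambda(1.08))
```

With the quoted values, $\lambda_c \simeq 0.42$ reproduces the window
$0.3 < p < 0.7$, while $K \simeq 1.08$ gives $\lambda_c > 1/2$, so no value of
$p$ reaches the unstable phase.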

Comparison with randomized networks 

Next we studied the same dynamics in two kinds of randomized networks derived
from the regulatory networks of E. coli and S. cerevisiae. The first type of
randomized graphs (type I) is constructed by repeatedly removing an edge
connecting nodes $v_1$ and $w_1$ and creating a new one between $v_2$ and
$w_2$, where both $v_1$ and $v_2$ have at least one outward edge and the node
pair $v_2$ and $w_2$ was not connected before this change. Thus these type-I
randomized networks have the same number of nodes, edges, and TFs as the
original networks, but the edges connect randomly chosen pairs of TF and target
gene. Our results for $N_m$ and $P(t_r)$ are shown in Fig. 2. For the type-I
randomized graphs derived from E. coli, $N_m$ is substantially suppressed as
compared with the original network. In the type-I random graphs derived from
S. cerevisiae, $N_m$ increases much more rapidly beyond
$\lambda \simeq 0.3$. The relaxation time distribution for the random graphs
from E. coli is broader than for the original network but still decays faster
than that for S. cerevisiae. The type-I randomization does not significantly
change the relaxation time distribution for S. cerevisiae.

The type-II randomized graphs we considered are constructed by exchanging the
end points of two edges: two randomly chosen edges $e_1 = (v_1, w_1)$ and
$e_2 = (v_2, w_2)$ are replaced by $e_1' = (v_1, w_2)$ and
$e_2' = (v_2, w_1)$, respectively. These graphs preserve the joint degree
distribution $P_d(k,q)$, but their local connectivity patterns may be different
from that in the original network. We present the plots of $N_m$ and $P(t_r)$
in Fig. 2. This type-II randomization does not change the relaxation time
distribution for S. cerevisiae either. Thus the much faster decay of the
relaxation time in the original and randomized networks for E. coli than in
those for S. cerevisiae can be ascribed to the much lower mean connectivity of
the former, $\langle k \rangle \simeq 1.24$, than that of the latter,
$\langle k \rangle \simeq 2.73$. Interestingly, the quantities $N_m$ and
$P(t_r)$ for these randomized graphs agree well with those for the original
network of S. cerevisiae, but not for E. coli. This implies that it is the
degree distribution that is mainly responsible for the spread of mutation in
S. cerevisiae, while other (local) structural factors must be important in
E. coli.
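
The type-II randomization can be sketched in Python as below. This is an
illustrative version, not the authors' procedure in detail: the function name,
the attempt limit, and the rule of rejecting swaps that would duplicate an
existing edge are assumptions added here. Each accepted swap exchanges the end
points of two edges, so every node keeps its in-degree and out-degree.

```python
import random

def type2_randomize(edges, n_swaps, seed=0):
    """Degree-preserving randomization by exchanging end points of edge pairs."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (v1, w1), (v2, w2) = edges[i], edges[j]
        new1, new2 = (v1, w2), (v2, w1)
        # Assumption: reject swaps that would create an already existing edge.
        if new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges

# Hypothetical small network used only to exercise the function.
original = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0), (1, 3)]
shuffled = type2_randomize(original, n_swaps=20, seed=1)
```

Because only end points are exchanged, the sorted lists of sources and of
targets are identical before and after the randomization.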

Abundance of single-element circuits in E. coli 

One might expect that circuits (directed closed paths) in the regulatory
network play an important role for the spread of mutations, because in networks
with a tree structure, i.e., without circuits, point mutations spread without
circulation and a node that is mutated will recover at the next time step and
never become mutated again, as indicated in Fig. 3(a). The nodes on a circuit,
on the other hand, can return to a mutated state even after recovery
[Fig. 3(b)]. The nodes lying on circuits or those on bridges connecting
distinct circuits can in principle switch their status permanently and thus
they can be considered as comprising a core in the dynamics of mutation spread.
As a subnetwork including all such circuits and the bridges connecting them, we
define the core of a network as the maximal subgraph in which each node has at
least one inward edge coming from and at least one outward edge incident to an
element of the core.

FIG. 3: Network structure dependence of mutation spread. The regulating rules
are given by $f_i(\sigma) = \sigma$ or $1 - \sigma$ for nodes $i$ with one
input and $f_i = 1$ or 0 for nodes $i$ with no input. Thus a mutated regulator
necessarily makes its target node mutated at the next time step. Time evolution
of $\Delta\sigma_i = |\hat{\sigma}_i - \sigma_i|$ for each node is shown in
tables. (a) No circuit (tree structure). All nodes recover at $t = 3$ and thus
the Hamming distance $H$ is zero. (b) A circuit of length 3. The point mutation
circulates with period 3, resulting in $H = 1/3$. (c) A single-element circuit
together with tree structure. All nodes are mutated at $t = 2$ and thus
$H = 1$.

FIG. 4: Organization of the core in E. coli and S. cerevisiae. (a) Core of
E. coli. It consists of 57 nodes and 84 edges. (b) Core of S. cerevisiae. It
has 63 nodes and 167 edges. (c) Histogram of the shortest circuit lengths. In
E. coli, no circuit longer than 1 is observed; all 54 circuits are
single-element ones. In S. cerevisiae, 836 pairs of nodes among all possible
1953 pairs in the core are connected by circuits and the shortest circuit
length ranges up to 19.

By deleting the edges having at either end a node that does not meet the
requirement for the core elements, we found the core subnetwork in the
regulatory networks of E. coli and S. cerevisiae. Note that if an edge has the
same node at both ends, this self-regulating node is an element of the core.
The relevance of the core to the mutation spread dynamics can be understood,
e.g., by investigating how the relaxation time distribution $P(t_r)$ in
S. cerevisiae depends on the location of the initial point mutation. Our
analysis shows that initial mutations in the core lead to a qualitatively
identical distribution of the relaxation time (a power law with the same
exponent). On the other hand, initial mutations in the output module,
consisting of all nodes that have inward edges coming from the nodes in the
core and their edges, decay very fast since the output module has a tree
structure and cannot cause mutations in the core.
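
The core definition above lends itself to an iterative pruning, sketched here
in Python. The function name and the example edge lists are hypothetical;
nodes are removed until every remaining node has at least one inward edge from
and one outward edge to a remaining node (a self-regulating node satisfies
both conditions by itself).

```python
def core_nodes(edges):
    """Nodes of the maximal subgraph in which each node has an in-edge from
    and an out-edge to a member of the subgraph."""
    nodes = {v for edge in edges for v in edge}
    while True:
        inside = [(v, w) for v, w in edges if v in nodes and w in nodes]
        has_in = {w for _, w in inside}
        has_out = {v for v, _ in inside}
        keep = has_in & has_out
        if keep == nodes:
            return nodes
        nodes = keep

# Hypothetical example: A feeds a two-node circuit (B, C) and a
# self-regulating node E; D is a pure output of C.
example = [("A", "B"), ("B", "C"), ("C", "B"), ("C", "D"), ("A", "E"), ("E", "E")]
core = core_nodes(example)

tree_core = core_nodes([("A", "B"), ("B", "C")])   # a tree has an empty core
```

In the example, A is dropped because nothing regulates it and D because it
regulates nothing, leaving the circuit B, C and the self-regulating node E.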

The organization of the core turns out to be very different in E. coli and
S. cerevisiae, as shown in Fig. 4(a) and (b), respectively. Most notably, the
nodes are much more densely connected in S. cerevisiae than in E. coli. This
difference can first be ascribed to different mean connectivities of the nodes
in the core: it is about 1.47 in E. coli and 2.65 in S. cerevisiae. However, a
more striking difference exists in their core organization. In E. coli, we
identified 54 circuits, all of which are single-element circuits representing
self-regulation. There are no circuits whose length (i.e., the number of edges
on the cycle) is larger than 1. In contrast, only one or two single-element
circuits are formed in its randomized graphs. This organization of circuits in
E. coli also contrasts with the one in S. cerevisiae. We computed the shortest
circuit for each pair of nodes in the core and counted the numbers of node
pairs for each given shortest-circuit length. The distribution of
shortest-circuit length obtained for S. cerevisiae is broad, as shown in
Fig. 4(c). We propose that the presence of single-element circuits in E. coli
is the main reason for the enhancement of $N_m$ of E. coli compared with both
of its randomized graphs. Once a node $i$ regulating itself is mutated, the
input configurations to the regulating rule $f_i$ are necessarily different
between the normal-mutant pair $(\Sigma, \hat{\Sigma})$, since it is guaranteed
that at least one of its regulators, the node $i$ itself, is mutated. Recalling
that a node can be mutated at the next time step only if the input
configurations from the normal-mutant pair are different, one can see that
single-element circuits have a higher probability to be mutated than nodes
which do not regulate themselves [see Fig. 3(c)]. Therefore networks with more
single-element circuits can be more adaptive.

In the core of the E. coli network, 54 edges are used for single-element
circuits and the remaining 30 edges connect pairs of distinct nodes. As a
result, the network has many isolated nodes and a few small connected
components, resulting in the rapid decay of the relaxation time. In Fig. 2(c),
we find that the relaxation times observed in E. coli are mostly 1 or 2. From
this, we can analytically predict the value of $N_m$ as a function of
$\lambda$. Suppose $N_m(t)$ saturates no later than time 2. From Eq. (1),
$H(1) = \lambda K N^{-1} + O(N^{-2})$ since $H(0) = N^{-1}$, and

$$N_m \simeq N H(2) \simeq N \lambda K H(1) \simeq \lambda^{2} K^{2}. \qquad (2)$$

This is in good agreement with the true value, as shown in Fig. 2(a).
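
As a quick arithmetic check of Eq. (2), the short Python sketch below evaluates
$\lambda^2 K^2$ with the value $K \simeq 1.08$ quoted for E. coli. The function
name and the chosen values of $p$ are illustrative only.

```python
def stationary_mutations(lam, K):
    """Two-step estimate N_m ~ (lam * K)**2 from Eq. (2)."""
    return (lam * K) ** 2

K_ECOLI = 1.08   # value of K quoted in the text for E. coli

estimates = {}
for p in (0.1, 0.3, 0.5):
    lam = 2 * p * (1 - p)                  # lambda = 2p(1-p)
    estimates[p] = stationary_mutations(lam, K_ECOLI)
```

At $p = 1/2$ ($\lambda = 1/2$) the estimate is $(0.54)^2 \approx 0.29$, which
is consistent with the observation that $N_m$ stays below 0.3 in E. coli.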

Power-law in-degree distribution in S. cerevisiae 

In S. cerevisiae, the most significant dynamical feature that we found and that
we need to explain is the slower increase of $N_m$ with $\lambda$ as compared
with the type-I randomized graph, shown in Fig. 2(b). Contrary to the type-II
randomized graphs, those of type I do not preserve the degree distribution of
the original network. From this, we can conjecture that the degree distribution
of S. cerevisiae causes the slow increase of $N_m$. To check this, we analyze
in detail the dependence of the Hamming distance on the degree distributions.

With uncorrelated in- and out-degrees, as is the case in the regulatory
networks considered here, Eq. (1) reduces to $H(t) \simeq \bar{H}(t)$ and

$$H(t+1) = \lambda \sum_{k} \left[1 - \left(1 - H(t)\right)^{k}\right] P_d(k). \qquad (3)$$

Thus the in-degree distribution $P_d(k)$ determines the behavior of the Hamming
distance $H(t)$. The in-degree distributions of E. coli and S. cerevisiae shown
in Fig. 5(a) are quite different from each other. The maximum degree is 31 in
S. cerevisiae while it is only 6 in E. coli. Furthermore, the log-log plot of
$P_d(k)$ in S. cerevisiae indicates that $P_d(k) \sim k^{-\gamma}$ with
$\gamma \simeq 2.7(2)$. The functional form of $P_d(k)$ for E. coli is hard to
determine because of the small range of observable $k$ values. Note that the
in-degree distribution of the type-I randomized graphs obeys a Poisson
distribution,
$P_d(k) = \langle k \rangle^{k} e^{-\langle k \rangle} / k!$. Let us consider
an in-degree distribution which has a power-law tail, i.e.,
$P_d(k) \sim k^{-\gamma}$. Then we find from Eq. (3) that the Hamming distance
in the stationary state behaves as $H \sim \Delta^{\beta}$ for $\lambda$ larger
than the critical value $\lambda_c$, with $\Delta = \lambda/\lambda_c - 1$ and
the critical exponent $\beta$ given by

$$\beta = \begin{cases} 1 & (\gamma > 3), \\ 1/(\gamma - 2) & (2 < \gamma < 3). \end{cases} \qquad (4)$$

The derivation of Eq. (4) is given in the Appendix. We restricted the range of
$\gamma$ to $\gamma > 2$ because the mean connectivity diverges for
$\gamma \le 2$. When the in-degree is subject to a Poisson distribution or an
exponentially decaying distribution, it corresponds to $\gamma \to \infty$ and
the critical behavior is the same as that for $\gamma > 3$. We present the
numerical solution to Eq. (3) in Fig. 5(b) for $\gamma \to \infty$ (Poisson
distribution), $\gamma = 3.5$, and $\gamma = 2.5$.

FIG. 5: Connectivity pattern and its effect on the critical behavior of the
Hamming distance. (a) In-degree distributions $P_d(k)$ for E. coli and
S. cerevisiae. For S. cerevisiae, its asymptotic behavior is a power law,
$P_d(k) \sim k^{-\gamma}$ with $\gamma \simeq 2.7(2)$. On the other hand, the
observed values of $k$ are only up to 6 and so it is hard to discern the
functional form of $P_d(k)$ in E. coli. (b) Hamming distance $H$ as a function
of $\lambda$ numerically obtained from Eq. (3) with $P_d(k)$ of the static
model [18], which has a power-law tail $P_d(k) \sim k^{-\gamma}$ with the
exponent $\gamma$ tunable. The inset shows that $H \sim \Delta$ commonly for
$\gamma \to \infty$ and $\gamma = 3.5$, and that $H \sim \Delta^{2}$ for
$\gamma = 2.5$, in agreement with Eq. (4).



The increase of $\beta$ with decreasing $\gamma$ below $\gamma = 3$ indicates a
difference in the behavior of the Hamming distance near the critical point
between networks with $\gamma > 3$ and those with $2 < \gamma < 3$. Suppose we
have two networks with a power-law in-degree distribution
$P_d(k) \sim k^{-\gamma}$: one has $\gamma = 3.5$ and the other has
$\gamma = 2.5$, and both have $\langle k \rangle = 4$. Then, in the region
$0 < \Delta = \lambda/\lambda_c - 1 \ll 1$, the Hamming distance behaves as
$H \sim \Delta$ for $\gamma = 3.5$ and $H \sim \Delta^{2}$ for $\gamma = 2.5$:
the former increases more rapidly than the latter in the region
$\Delta \ll 1$. Also the region where the Hamming distance remains non-zero but
small, e.g., $H < 0.05$, is larger with $\gamma = 2.5$ than with
$\gamma = 3.5$: it is given by $\lambda \in (0.25, 0.29]$ with $\gamma = 3.5$
and $\lambda \in (0.25, 0.35]$ with $\gamma = 2.5$. Such dependence of the
Hamming distance on the in-degree exponent $\gamma$ can thus explain the
different network responses of S. cerevisiae and its type-I randomized graphs.
It is the broad in-degree distribution with $\gamma = 2.7(2)$ that makes the
number of mutated elements increase with $\lambda$ more slowly than in the
corresponding type-I randomized graphs, which have $\gamma \to \infty$. Due to
such a slow increase of the Hamming distance, S. cerevisiae can keep the size
of mutation small for a wider range of the parameter $p$ or $\lambda$; the
mutation size would be much larger with random structures.
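
The stationary Hamming distance can be obtained numerically by iterating
Eq. (3) to its fixed point, as in the Python sketch below. It is an
illustration under stated assumptions, not the calculation behind Fig. 5: the
function names, the truncation `k_max`, the iteration count, and the simple
truncated power law (used here instead of the static model) are all choices
made for this example.

```python
import math

def poisson_pk(mean, k_max):
    """Poisson in-degree distribution P(k) for k = 0..k_max (tail truncated)."""
    pk = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        pk.append(pk[-1] * mean / k)
    return pk

def power_law_pk(gamma, k_min, k_max):
    """Assumed form: P(k) proportional to k**(-gamma) for k_min <= k <= k_max,
    with k_min >= 1."""
    weights = [0.0] * k_min + [k ** -gamma for k in range(k_min, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def stationary_hamming(lam, pk, iterations=2000):
    """Iterate H -> lam * sum_k [1 - (1 - H)**k] P(k), starting from H = 1/2."""
    h = 0.5
    for _ in range(iterations):
        h = lam * sum(p * (1.0 - (1.0 - h) ** k) for k, p in enumerate(pk))
    return h

def critical_exponent(gamma):
    """Exponent beta of Eq. (4), defined for gamma > 2 and gamma != 3."""
    return 1.0 if gamma > 3 else 1.0 / (gamma - 2.0)

pk = poisson_pk(mean=4.0, k_max=60)          # <k> = 4, so lambda_c = 0.25
h_below = stationary_hamming(0.2, pk)        # lambda < lambda_c
h_above = stationary_hamming(0.4, pk)        # lambda > lambda_c
```

Passing the output of `power_law_pk` instead of `poisson_pk` allows the same
iteration to be used for a broad in-degree distribution.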



CONCLUSION 

We performed numerical experiments - spread of mu- 
tation - to probe the dynamic stability of the recently- 
unveiled networks of gene transcriptional regulation of 
E. coli and S. cerevisiae and provided analytical confir- 
mation for the results by analyzing their structural fea- 
tures. While the small number of edges per node in E. 
coli fundamentally prohibits a global spread of mutation, 
a relatively large number of edges in S. cerevisiae enables 
a global mutation conditionally depending on the regu- 
lating rules. We further identified the relevant structural 
features which are distinguished from those of random 
graphs: All circuits of the regulatory network of E. coli 
are single-element circuits and the in-degree distribution 
of S. cerevisiae takes a power-law form. Single-element circuits in E. coli
have a higher probability to be mutated than nodes without self-regulation. The
broad in-degree distribution in S. cerevisiae smooths the increase of the
number of mutated elements. This increase would be sharper for an exponential
distribution, as is the case in the random graphs.

These biological networks appear to follow design principles that tend to
balance the size of mutation. The small mean connectivity of the regulatory
network of E. coli would restrict the size of mutations drastically, which is
compensated by the abundance of single-element circuits that lead to the
required enhancement of the mutation size. In the case of S. cerevisiae, the
global characteristic of its regulatory network, a mean connectivity larger
than 2, would lead to a very large mutation size, but a very heterogeneous
interconnectivity pattern suppresses it. These local structural features
demonstrate that both genetic networks have evolved, in spite of the
restrictions imposed by the global characteristics, in such a direction that
they can stay dynamically between stable (i.e., rarely mutated on a global
scale) and unstable (easily mutated). Being neither stable nor unstable appears
to be necessary for living organisms to maintain a stable internal state and
adapt to a fluctuating external environment simultaneously. Therefore our
finding suggests that such a marginal dynamic stability of the whole system is
supported by a selected structural organization of the internal systems on
smaller scales, such as the transcriptional regulatory network studied in this
work. While we have concentrated only on the average in-degree, the
organization of circuits, and the in-degree distribution of the network,
further structural analysis will be helpful to illuminate how structure
supports function.

We thank Uri Alon and Richard A. Young for allowing us to use their data. This
work was supported by Deutsche Forschungsgemeinschaft (DFG).



DERIVATION OF EQ. (4) FROM EQ. (3)

To find the behavior of $H = \lim_{t\to\infty} H(t)$ as a function of
$\lambda$ near the critical point $\lambda_c = \langle k \rangle^{-1}$, we set
$H(t+1) = H(t) = H$ and expand Eq. (3) for small $H$, approximating the
binomial coefficients by $\binom{k}{n} \simeq k^{n}/n!$, which leads to

$$H = \lambda \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\langle k^{n} \rangle H^{n}}{n!}. \qquad (5)$$

Here $\langle k^{n} \rangle$ is the $n$th moment of the in-degree distribution
$P_d(k)$, i.e., $\langle k^{n} \rangle = \sum_k k^{n} P_d(k)$. It is finite for
all $n$ only if $P_d(k)$ decays exponentially. In this case, all the terms on
the right-hand side of Eq. (5) are analytic and, keeping the first two leading
terms, one finds that Eq. (5) is expressed as
$H \simeq \lambda \langle k \rangle H - \lambda \langle k^{2} \rangle H^{2}/2$.
This allows us to see that $H = 0$ for
$\lambda < \lambda_c = \langle k \rangle^{-1}$ and $H \sim \Delta$ with
$\Delta = (\lambda - \lambda_c)/\lambda_c$ for $\lambda > \lambda_c$.

When the in-degree distribution is asymptotically a power law,
$P_d(k) \sim k^{-\gamma}$, not all the moments $\langle k^{n} \rangle$ are
finite: $\langle k^{n} \rangle$ for $n > n_*$, with
$n_* = \lceil \gamma - 2 \rceil$, diverges as
$k_{\max}^{\,n-\gamma+1}/(n - \gamma + 1)$, where $\lceil x \rceil$ is the
smallest integer not smaller than $x$ and $k_{\max}$ is the (average) largest
in-degree. The largest in-degree diverges as $N^{1/(\gamma-1)}$, which is
derived from the relation $\sum_{k > k_{\max}} P_d(k) \sim N^{-1}$. Thus
$\langle k^{n} \rangle \sim N^{(n-\gamma+1)/(\gamma-1)}$ for $n > n_*$. Such
diverging terms are arranged as

$$H^{\gamma-1} \sum_{n > n_*} (-1)^{n+1} \frac{\left[k_{\max} H\right]^{n-\gamma+1}}{n!\,(n - \gamma + 1)}$$

on the right-hand side of Eq. (5). Here the summation converges to a constant
in the limit $k_{\max} H \to \infty$ due to the alternating signs and the fast
decay of the coefficients. Thus the small-$H$ expansion of Eq. (3) reads

$$H = \lambda \sum_{n=1}^{n_*} (-1)^{n+1} \frac{\langle k^{n} \rangle H^{n}}{n!} + \lambda\,(\mathrm{constant})\,H^{\gamma-1} + \cdots.$$

The $H^{\gamma-1}$ term is relevant to the critical behavior of $H$ for
$\gamma < 3$, since it holds for $\gamma < 3$ that
$H \simeq \lambda \langle k \rangle H + \lambda\,(\mathrm{const.})\,H^{\gamma-1}$,
yielding $H \sim \Delta^{1/(\gamma-2)}$. On the other hand, the linear and
quadratic terms are relevant for $\gamma > 3$, as for exponentially decaying
in-degree distributions. In summary, the Hamming distance $H$ with a power-law
in-degree distribution $P_d(k) \sim k^{-\gamma}$ behaves near the critical
point as